computation path
Programs as Singularities
We develop a correspondence between the structure of Turing machines and the structure of singularities of real analytic functions, based on connecting the Ehrhard-Regnier derivative from linear logic with the role of geometry in Watanabe's singular learning theory. The correspondence works by embedding ordinary (discrete) Turing machine codes into a family of noisy codes which form a smooth parameter space. On this parameter space we consider a potential function which has Turing machines as critical points. By relating the Taylor series expansion of this potential at such a critical point to combinatorics of error syndromes, we relate the local geometry to internal structure of the Turing machine. The potential in question is the negative log-likelihood for a statistical model, so that the structure of the Turing machine and its associated singularity is further related to Bayesian inference. Two algorithms that produce the same predictive function can nonetheless correspond to singularities with different geometries, which implies that the Bayesian posterior can discriminate between distinct algorithmic implementations, contrary to a purely functional view of inference. In the context of singular learning theory our results point to a more nuanced understanding of Occam's razor and the meaning of simplicity in inductive inference.
- Workflow (0.67)
- Research Report > New Finding (0.47)
Unambiguity and Fewness for Nonuniform Families of Polynomial-Size Nondeterministic Finite Automata
Nonuniform families of polynomial-size finite automata, which are series of indexed finite automata having polynomially many inner states, are used in the past literature to solve nonuniform families of promise decision problems. Among such nonuniform families of finite automata, we focus our attention, in particular, on the variants of nondeterministic finite automata, which have at most "one" (unambiguous), "polynomially many" (few) accepting computation paths, or unambiguous/few computation paths leading to each fixed configuration. When such machines are limited to make only one-way head moves, we can prove with no unproven hardness assumptions that some of these variants are different in computational power from each other. As for two-way machines restricted to instances of polynomially-bounded length, families of two-way polynomial-size nondeterministic finite automata are equivalent in power to families of polynomial-size unambiguous finite automata.
- Europe > Poland > Masovia Province > Warsaw (0.04)
- Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)
- Asia > Japan (0.04)
Learning Narrow One-Hidden-Layer ReLU Networks
Chen, Sitan, Dou, Zehao, Goel, Surbhi, Klivans, Adam R, Meka, Raghu
We consider the well-studied problem of learning a linear combination of $k$ ReLU activations with respect to a Gaussian distribution on inputs in $d$ dimensions. We give the first polynomial-time algorithm that succeeds whenever $k$ is a constant. All prior polynomial-time learners require additional assumptions on the network, such as positive combining coefficients or the matrix of hidden weight vectors being well-conditioned. Our approach is based on analyzing random contractions of higher-order moment tensors. We use a multi-scale analysis to argue that sufficiently close neurons can be collapsed together, sidestepping the conditioning issues present in prior work. This allows us to design an iterative procedure to discover individual neurons.
Tracing the Propagation Path: A Flow Perspective of Representation Learning on Graphs
Wang, Menghan, Zhang, Kun, Li, Gulin, Yang, Keping, Si, Luo
Graph Convolutional Networks (GCNs) have gained significant developments in representation learning on graphs. However, current GCNs suffer from two common challenges: 1) GCNs are only effective with shallow structures; stacking multiple GCN layers will lead to over-smoothing. 2) GCNs do not scale well with large, dense graphs due to the recursive neighborhood expansion. We generalize the propagation strategies of current GCNs as a \emph{"Sink$\to$Source"} mode, which seems to be an underlying cause of the two challenges. To address these issues intrinsically, in this paper, we study the information propagation mechanism in a \emph{"Source$\to$Sink"} mode. We introduce a new concept "information flow path" that explicitly defines where information originates and how it diffuses. Then a novel framework, namely Flow Graph Network (FlowGN), is proposed to learn node representations. FlowGN is computationally efficient and flexible in propagation strategies. Moreover, FlowGN decouples the layer structure from the information propagation process, removing the interior constraint of applying deep structures in traditional GCNs. Further experiments on public datasets demonstrate the superiority of FlowGN against state-of-the-art GCNs.
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
- (2 more...)
- Information Technology > Communications (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Visualizing the decision-making process in deep neural decision forest
Li, Shichao, Cheng, Kwang-Ting
Deep neural decision forest (NDF) achieved remarkable performance on various vision tasks via combining decision tree and deep representation learning. In this work, we first trace the decision-making process of this model and visualize saliency maps to understand which portion of the input influence it more for both classification and regression problems. We then apply NDF on a multi-task coordinate regression problem and demonstrate the distribution of routing probabilities, which is vital for interpreting NDF yet not shown for regression problems. The pre-trained model and code for visualization will be available at https://github.com/Nicholasli1995/VisualizingNDF
Deep Hierarchical Machine: a Flexible Divide-and-Conquer Architecture
Li, Shichao, Yang, Xin, Cheng, Tim
We propose Deep Hierarchical Machine (DHM), a model inspired from the divide-and-conquer strategy while emphasizing representation learning ability and flexibility. A stochastic routing framework as used by recent deep neural decision/regression forests is incorporated, but we remove the need to evaluate unnecessary computation paths by utilizing a different topology and introducing a probabilistic pruning technique. We also show a specified version of DHM (DSHM) for efficiency, which inherits the sparse feature extraction process as in traditional decision tree with pixel-difference feature. To achieve sparse feature extraction, we propose to utilize sparse convolution operation in DSHM and show one possibility of introducing sparse convolution kernels by using local binary convolution layer. DHM can be applied to both classification and regression problems, and we validate it on standard image classification and face alignment tasks to show its advantages over past architectures.
Tensorflow: The Confusing Parts (1) Buckman's Homepage
Click here to skip the intro and dive right in! When I started the residency program in the summer of 2017, I had a lot of experience programming, and a good understanding of machine learning, but I had never used Tensorflow before. I figured that given my background I'd be able to pick it up quickly. To my surprise, the learning curve was fairly steep, and even months into the residency, I would occasionally find myself confused about how to turn ideas into Tensorflow code. I'm writing this blog post as a message-in-a-bottle to my former self: it's the introduction that I wish I had been given before starting on my journey. Hopefully, it will also be a helpful resource for others. In the three years since its release, Tensorflow has cemented itself as a cornerstone of the deep learning ecosystem.